Notes, material, stuff... regarding the debugging of your system.

"strace: out of memory" error
[2]

If you are stracing a process and are getting an error like "strace: out of memory", watch out: this is not an error in the application, but an error in strace itself.

This message indicates that strace does not have enough memory to create its own structures to strace your process.

Certain versions of strace, still quite widespread as of 2005/2006, have a bug in tracing multithreaded applications that confuses strace about how much memory it needs to trace the process.

To verify that your application is multithreaded, you can use something like "ps -L aux". If your process has more than one line with the same PID, then it is multithreaded.

Updating strace might solve the problem. Another solution might be to start tracing your process before it spawns other threads: don't attach to it with the strace '-p' parameter, but run the process under strace from the start.
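
As a rough sketch of the two checks above (Linux only, and assuming /proc is mounted): the helper below reads the thread count of a PID from /proc, and the commented lines show strace started together with the process rather than attached to it. The name thread_count, the PID 1234 and ./mydaemon are made up for illustration.

```shell
# Print the number of threads of a process, as reported by the kernel.
# Usage: thread_count PID
thread_count() {
    # /proc/PID/status contains a line like "Threads:   4"
    awk '/^Threads:/ { print $2 }' "/proc/$1/status"
}

# Hypothetical usage (1234 and ./mydaemon are placeholders):
#   thread_count 1234
#
# Trace from the start instead of attaching with -p. -f follows threads
# and children as they are created, -o writes the trace to a file:
#   strace -f -o /tmp/mydaemon.strace ./mydaemon
```

A count greater than 1 means the process is multithreaded, which is the case where the older strace versions are likely to get confused.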

tcpdump and -i any
[13]

On recent Linux kernels, tcpdump can listen on multiple interfaces.

In order to do so, just specify the 'any' virtual interface with something like:
# tcpdump -i any 
  

When the 'any' interface parameter is specified, interfaces are not put into promiscuous mode.

To manually set interfaces into promiscuous mode, just use something like:
# ifconfig eth0 promisc
or
# ip link set eth0 promisc on

Note that we don't know of any way to selectively specify a list of interfaces to listen on, nor of any way to get the name of the interface on which a given packet was captured.

A workaround could be the '-e' parameter, to have link-level headers dumped. Note, however, that link-level headers may easily be spoofed or just wrong.
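
To double check that one of the promisc commands above took effect, something like the following might help. It is only a sketch for Linux: it assumes sysfs is mounted on /sys and tests the IFF_PROMISC bit (0x100) of the interface flags. The name is_promisc is invented here.

```shell
# Return success if the given interface has the promiscuous flag set,
# 1 if it does not, 2 if the interface flags cannot be read.
# Usage: is_promisc IFACE
is_promisc() {
    # sysfs exposes the interface flags as a hex number, e.g. 0x1103
    flags=$(cat "/sys/class/net/$1/flags" 2>/dev/null) || return 2
    # IFF_PROMISC is bit 0x100 in <linux/if.h>
    [ $(( $flags & 0x100 )) -ne 0 ]
}

# Hypothetical usage:
#   is_promisc eth0 && echo "eth0 is in promiscuous mode"
```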

Debugging application related MTU/MSS problems...
[16]

Ok, so you just wrote your wonderful daemon that listens on a given socket and provides some nice service over the network?

With manual file descriptor handling (using open, write, read, ...) it is easy to make errors in handling buffers and network conditions correctly.

If you have had reports of people not being able to use your software from certain internet locations, or you are about to release a new version of your software, you should check that the network related routines behave correctly with different MTUs and packet sizes. Small packets can easily trigger buffering errors, or make some of the assumptions built into your code fail miserably.

One easy way to check the behavior with different MTUs is just to set the loopback MTU to a different size, and then perform some tests on the application by connecting to 127.0.0.1.

To know more about how to set the MTU of an interface, please refer to http://notes.inscatolati.net/[en]/system[en]/networking[en]/index.html#15.
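
The loopback trick above could be wrapped in something like the following. It is a minimal sketch, assuming Linux with the iproute2 "ip" command and sysfs mounted on /sys. The names with_lo_mtu, IP_CMD, LO_MTU_FILE and ./run-tests.sh are made up for this example, and the two variables exist only so that the helper can be dry-run without root (for example with IP_CMD=echo).

```shell
# Overridable so that the sketch can be tried without root
IP_CMD=${IP_CMD:-ip}
LO_MTU_FILE=${LO_MTU_FILE:-/sys/class/net/lo/mtu}

# Run a command with the loopback MTU temporarily changed.
# Usage: with_lo_mtu MTU COMMAND [ARGS...]
with_lo_mtu() {
    new_mtu=$1
    shift
    # Remember the current MTU so that it can be put back afterwards
    old_mtu=$(cat "$LO_MTU_FILE") || return 1
    "$IP_CMD" link set dev lo mtu "$new_mtu" || return 1
    "$@"
    status=$?
    # Always restore the original MTU, even if the command failed
    "$IP_CMD" link set dev lo mtu "$old_mtu"
    return "$status"
}

# Hypothetical usage, as root (./run-tests.sh stands for whatever
# connects to your daemon on 127.0.0.1):
#   for mtu in 576 1280 1500; do
#       with_lo_mtu "$mtu" ./run-tests.sh || echo "failed with MTU $mtu"
#   done
```

Restoring the old value matters here: plenty of other local software talks over the loopback interface, and it will keep seeing the small MTU otherwise.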

Sys::Syslog error '/dev/conslog not writable' and perl leaving...
[22]

Ok, symptoms:

  • Your script dies with an error similar to the following:
      stream /dev/conslog is not writable at ...
      console is not writable at ...
      no connection to syslog available ...
      

  • If you run your script as 'root', everything works perfectly, while as a normal user syslog does not seem to work... No! Users should be allowed to log messages! It's not a privileged operation! Don't even think about that!

  • 'logger' or similar commands seem to work and log fine, without trouble, both as a user and as an administrator... I used something like:
    $ logger -t test -p local0.notice 'test'
    $ su
    # logger -t test -p local0.notice 'test'
          

Well, you should:
  • make sure syslog is properly configured and running and that devices/unix sockets have the correct permissions, by checking that the above command ('logger ...') works correctly.

  • make sure your code calls the 'openlog' function once and only once. On my Perl 5.8.4 Linux system, I had two different modules calling the 'openlog' function... and well, as 'root', everything worked just fine, but as soon as I switched to an unprivileged user, the second 'openlog' caused the error described.
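
To hunt for a duplicate call, a plain recursive grep over the script and the directories holding its modules is usually enough. A minimal sketch: find_openlog_calls is a made-up name, and the paths in the usage line are placeholders.

```shell
# List every line mentioning openlog in the given files or directories.
# Usage: find_openlog_calls PATH...
find_openlog_calls() {
    # -r: recurse into directories, -n: print line numbers
    grep -rn 'openlog' "$@"
}

# Hypothetical usage:
#   find_openlog_calls ./myscript.pl ./lib
```

More than one hit among the modules loaded by a single script is a candidate for the error described above.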

Other workarounds? Have I been a fool? Am I totally wrong? Well, removing the second openlog call just solved my problems, so let's not bother investigating what's going wrong...

Debugging unix domain sockets
[38]

Ever found yourself trying to debug an application that relies on unix sockets as its way of communication? Well, a tool that often comes in handy is "socat". Socat is a bit like netcat, but supports all sorts of connection styles, socket types, and so on. Try things like:
   socat STDIN UNIX-CONNECT:/tmp/unix-socket
 
socat is great: it has libreadline support (history and so on) and lets you set or change almost every parameter of any socket type. Have a peek at its man page. It is also useful from bash scripts, if you want to do any kind of connection handling. But... you'd probably be better off with a real programming language if that's what you want to do. Stop abusing your shell.
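
If you just want to get a feel for it before pointing socat at a real application, the sketch below sets up a throwaway echo server on a unix socket and sends it one line. unix_socket_roundtrip and the socket path are names made up for this example. UNIX-LISTEN, UNIX-CONNECT and EXEC are regular socat address types.

```shell
# Start an echo server on a unix socket, send it "ping", print the reply.
# Usage: unix_socket_roundtrip /path/to/socket
unix_socket_roundtrip() {
    sock=$1
    # Server side: accept one connection and hand it to cat, which
    # echoes back whatever it receives
    socat UNIX-LISTEN:"$sock" EXEC:cat &
    listener=$!
    # Wait (up to about 5 seconds) for the socket to show up
    tries=0
    while [ ! -S "$sock" ] && [ "$tries" -lt 50 ]; do
        sleep 0.1
        tries=$((tries + 1))
    done
    if [ ! -S "$sock" ]; then
        kill "$listener" 2>/dev/null
        return 1
    fi
    # Client side: same idea as "socat STDIN UNIX-CONNECT:...", with
    # the input coming from a pipe instead of the keyboard
    echo ping | socat - UNIX-CONNECT:"$sock"
    wait "$listener"
    rm -f "$sock"
}

# Hypothetical usage:
#   unix_socket_roundtrip /tmp/unix-socket-test
```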

Logging the boot output
[45]

You have an init script failing with some weird error? The console is remote, and you need to check what's happening during the boot process? In Debian, nowadays, you can enable bootlogd to save whatever is written to the console during the boot process. In /etc/default/bootlogd, just add the line:
 BOOTLOGD_ENABLE=yes
 
Not everything is logged, but it's much better than nothing. Just check dmesg, and the file /var/log/boot.

Debugging an initrd (or an unbootable system...)
[51]

initrds created with mkinitramfs, mkinitrd, yaird or any other tool can sometimes contain errors that make your system unbootable, or that output errors during the boot process. It is usually a pain to debug those errors, as... you often don't have a shell, needed software is missing from the initrd, ...

Generally, there are two approaches that can be easily used to debug an initrd:

  • uncompress the initrd, and have a peek in the scripts to see what's being done and what it is doing when the init process stops (and why) - always works.

  • at boot time, have the initrd output some debugging lines, or get a prompt to try to manually understand what's going wrong or why the commands are failing. Using this method requires support from the tool used to create the initrd, so it won't be discussed in this note.

So, to access the content of an initrd, you need to run a few commands depending on the format of the initrd itself. Nowadays, most initrds are either gzip-compressed cpio archives, cramfs or other more or less esoteric file systems. You can start with something like:
   % file -Ls /boot/initrd.img
   
If you are lucky, it will be a cramfs:
   /boot/initrd.img-2.6.8-3-686-smp: Linux Compressed ROM File System data, [...]
   
just mount it with something like:
   % mount -o loop /boot/initrd.img /mnt/whatever
   
If you are a bit less lucky, it will be gzip compressed:
   /boot/initrd.img: gzip compressed data, from Unix,
   
Start by just uncompressing it:
   % gzip -cd < /boot/initrd.img > /tmp/initrd.uncompressed
   
Repeat the "file" command above against /tmp/initrd.uncompressed. If you are lucky, again, it will be a filesystem. Just mount it with the same "mount -o loop..." as above. If you are a bit less lucky, you will see something like:
   $ file -sL /tmp/initrd.uncompressed
   /tmp/initrd.uncompressed: ASCII cpio archive (SVR4 with no CRC)
   
which means that the initrd is a simple cpio archive. Extract it with something like the following (cpio unpacks into the current directory, so cd to an empty one first):
   $ cpio --extract --make-directories < /tmp/initrd.uncompressed
   
Now, you can start your debugging by looking into /linuxrc, /init, or /sbin/init. You can obviously modify the filesystem, and reverse the steps to create a new initrd to test. Good luck!
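
The cpio path above, and the reverse steps, can be sketched as two small helpers. This is only a sketch under a few assumptions: GNU cpio, gzip and file are installed, and a gzip-compressed image contains a cpio archive (for cramfs or other filesystem images, use the "mount -o loop" approach instead). unpack_initrd, repack_initrd and the paths in the usage lines are names made up here. The "newc" format is the one that file reports as "SVR4 with no CRC".

```shell
# Extract an initrd (gzip-compressed or plain cpio) into a directory.
# Usage: unpack_initrd IMAGE DESTDIR
unpack_initrd() {
    img=$1
    dest=$2
    mkdir -p "$dest" || return 1
    case $(file -Lb "$img") in
        gzip*)
            # Uncompress on the fly and let cpio unpack inside DESTDIR
            gzip -cd < "$img" |
                (cd "$dest" && cpio --extract --make-directories --quiet)
            ;;
        *cpio*)
            (cd "$dest" && cpio --extract --make-directories --quiet) < "$img"
            ;;
        *)
            echo "unpack_initrd: format not handled by this sketch" >&2
            return 1
            ;;
    esac
}

# Reverse the steps: pack a directory into a gzip-compressed cpio image.
# Usage: repack_initrd SRCDIR NEWIMAGE
repack_initrd() {
    src=$1
    out=$2
    (cd "$src" && find . | cpio --create --format=newc --quiet) |
        gzip -9 > "$out"
}

# Hypothetical usage (run as root to preserve file ownership):
#   unpack_initrd /boot/initrd.img /tmp/initrd-root
#   vi /tmp/initrd-root/init
#   repack_initrd /tmp/initrd-root /boot/initrd.img.test
```

Writing the result under a new name, as in the usage lines, keeps the original initrd around in case the modified one turns out to be even less bootable.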

Generated by CRON on 2012/02/14 at 06:26:35.